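Image composition aims to create a composite image by inserting an object from one image into another background image. The placement of the inserted object (e.g., location, size, occlusion) may be unreasonable, which significantly degrades the quality of the composite image. Although some works have attempted to learn object placement to create realistic composite images, they did not focus on assessing the plausibility of object placement. In this paper, we focus on the object placement assessment task, which verifies whether a composite image is plausible in terms of object placement. To accomplish this task, we construct the first Object Placement Assessment (OPA) dataset, consisting of composite images and their rationality labels. We also propose a simple yet effective baseline for this task. The dataset is available at https://github.com/bcmi/object-placement-assessment-dataset-opa.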
translated by Google Translate
Current mainstream object detection methods for large aerial images usually divide each image into patches and then exhaustively run detection on every patch, whether or not it contains objects. This paradigm, although effective, is inefficient because the detector has to process all patches, which severely limits inference speed. This paper presents an Objectness Activation Network (OAN) that helps detectors focus on fewer patches while achieving faster inference and more accurate results, enabling a simple and effective solution to object detection in large images. In brief, OAN is a lightweight fully-convolutional network that judges whether each patch contains objects; it can be easily integrated into many object detectors and jointly trained with them end-to-end. We extensively evaluate OAN with five advanced detectors. Using OAN, all five detectors achieve more than 30.0% speed-up on three large-scale aerial image datasets, along with consistent accuracy improvements. On extremely large Gaofen-2 images (29200$\times$27620 pixels), OAN improves the detection speed by 70.5%. Moreover, we extend OAN to driving-scene object detection and 4K video object detection, boosting the detection speed by 112.1% and 75.0%, respectively, without sacrificing accuracy. Code is available at https://github.com/Ranchosky/OAN.
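The patch-filtering idea can be illustrated with a minimal sketch. The abstract does not describe OAN's architecture, so `objectness_score` below is a hypothetical stand-in (mean pixel intensity) for the learned network, `run_detector` is a placeholder for any detector, and the threshold of 0.5 is an assumption.

```python
def split_into_patches(image, size):
    """Tile a 2D image (list of rows) into non-overlapping size x size patches."""
    patches = []
    for r in range(0, len(image), size):
        for c in range(0, len(image[0]), size):
            pixels = [row[c:c + size] for row in image[r:r + size]]
            patches.append((r, c, pixels))
    return patches


def objectness_score(pixels):
    """Hypothetical stand-in for the learned objectness network: mean intensity."""
    values = [v for row in pixels for v in row]
    return sum(values) / len(values)


def detect_on_active_patches(image, size, run_detector, threshold=0.5):
    """Run the detector only on patches whose objectness exceeds the threshold."""
    patches = split_into_patches(image, size)
    active = [p for p in patches if objectness_score(p[2]) > threshold]
    detections = []
    for r, c, pixels in active:
        detections.extend(run_detector(r, c, pixels))
    return detections, len(active), len(patches)
```

The returned counts show how many patches the detector actually visited; the saving grows with the share of empty patches.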
In task-oriented dialogs such as MultiWoZ (Budzianowski et al., 2018), an informative and/or successful system response needs to include necessary key information such as the phone number of a hotel. We therefore hypothesize that helping the model focus more on learning key quantities in the dialog lets it generate more informative and helpful responses. In this paper, we propose a new training algorithm, Reinforced Language Modeling (RLM), which uses a fine-grained reward function and reinforcement learning to help the model generate key quantities correctly at test time. Empirical results show that the proposed RLM achieves state-of-the-art performance on the inform rate, success rate, and combined score in MultiWoZ.
A fundamental question in any peer-to-peer ride-sharing system is how to meet passenger requests both effectively and efficiently so that supply and demand stay balanced in real time. On the passenger side, traditional approaches focus on pricing strategies that adjust the distribution of demand by increasing the probability that users request a ride. However, previous methods do not account for how a change in strategy affects future supply and demand: passengers' calls reposition drivers to different destinations, which affects drivers' income for some time afterwards. Motivated by this observation, we optimize the distribution of demand by learning long-term spatio-temporal values as a guideline for the pricing strategy. In this study, we propose an offline deep reinforcement learning method that focuses on the demand side to improve the utilization of transportation resources and customer satisfaction. We adopt a spatio-temporal learning method to learn the value of different times and locations, then incentivize passengers' ride requests to adjust the distribution of demand and balance supply and demand in the system. In particular, we model the problem as a Markov Decision Process (MDP).
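As a rough illustration of learning spatio-temporal values offline, the sketch below runs tabular TD(0) over logged trips. This is a simplification under stated assumptions, not the paper's deep reinforcement learning method: the state is a hypothetical `(time_slot, zone)` pair, and the discount `gamma`, step size `alpha`, and epoch count are illustrative.

```python
from collections import defaultdict


def learn_values(transitions, gamma=0.9, alpha=0.1, epochs=200):
    """Tabular TD(0) over logged (state, reward, next_state) trips.

    A state is a (time_slot, zone) pair; next_state is None when the
    episode ends. All hyperparameters are illustrative assumptions.
    """
    values = defaultdict(float)
    for _ in range(epochs):
        for state, reward, next_state in transitions:
            bootstrap = values[next_state] if next_state is not None else 0.0
            target = reward + gamma * bootstrap
            values[state] += alpha * (target - values[state])
    return values
```

A pricing rule could then, for example, favor requests whose destination has a higher learned value; how the values feed into prices is not specified in the abstract.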
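Semantic communication in the 6G era is regarded as a promising communication paradigm that can break through the bottleneck of traditional communications. However, its application in multi-user scenarios, especially the broadcast case, remains unexplored. To effectively exploit the benefits enabled by semantic communication, in this paper we propose a one-to-many semantic communication system. Specifically, we propose a deep neural network (DNN) enabled semantic communication system called MR_DeepSC. By leveraging the semantic features of different users, a semantic recognizer based on a pre-trained model, i.e., DistilBERT, is built to distinguish between users. Furthermore, transfer learning is adopted to speed up the training of new receiver networks. Simulation results show that the proposed MR_DeepSC achieves the best performance among the benchmarks under different channel conditions, especially in the low signal-to-noise ratio (SNR) regime.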
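Transformer encoder models have shown impressive performance in dialogue modeling. However, because Transformers are inefficient at processing long sequences, the dialogue history often needs to be truncated. To address this problem, we propose a new memory-augmented Transformer that is compatible with existing pre-trained encoder models and preserves history information efficiently. It incorporates a separate memory module alongside the pre-trained Transformer to effectively exchange information between the memory states and the current input context. We evaluate our model on three dialogue datasets and two language modeling datasets. Experimental results show that our method achieves superior efficiency and performance compared with other pre-trained Transformer baselines.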
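Why do networks have negative weights at all? The answer is: to learn more functions. We mathematically prove that deep neural networks with all non-negative weights are not universal approximators. Much of the deep learning literature assumes this fundamental result without having previously proved it or demonstrated its necessity.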
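One standard intuition, offered here as an illustration and not necessarily the paper's proof: with non-negative weights and a non-decreasing activation such as ReLU, every layer is non-decreasing in its inputs, so the whole network is non-decreasing and cannot approximate a decreasing target such as f(x) = -x. The sketch below checks this numerically on a random network; the layer widths, weight ranges, and seed are arbitrary assumptions, and biases are left unconstrained.

```python
import random


def relu(x):
    return x if x > 0.0 else 0.0


def make_nonneg_net(widths, rng):
    """Random MLP with weights in [0, 1) and unconstrained biases in [-1, 1]."""
    layers = []
    for n_in, n_out in zip(widths[:-1], widths[1:]):
        weights = [[rng.random() for _ in range(n_in)] for _ in range(n_out)]
        biases = [rng.uniform(-1.0, 1.0) for _ in range(n_out)]
        layers.append((weights, biases))
    return layers


def forward(layers, x):
    """Scalar-in, scalar-out forward pass; ReLU on hidden layers only."""
    h = [x]
    for i, (weights, biases) in enumerate(layers):
        z = [sum(w * v for w, v in zip(row, h)) + b
             for row, b in zip(weights, biases)]
        h = z if i == len(layers) - 1 else [relu(v) for v in z]
    return h[0]


rng = random.Random(0)
net = make_nonneg_net([1, 8, 8, 1], rng)
xs = [i / 10.0 for i in range(-50, 51)]
ys = [forward(net, x) for x in xs]
# Small tolerance guards against floating-point rounding in the sums.
is_non_decreasing = all(b >= a - 1e-12 for a, b in zip(ys, ys[1:]))
```

Because any non-decreasing g satisfies g(1) >= g(-1), its worst-case error against f(x) = -x on [-1, 1] is at least 1, no matter how large the network is.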
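Motivated by growing concern about personal data privacy and the rapidly increasing volume of data at local clients, federated learning (FL) has become a new machine learning setting. An FL system consists of a central parameter server and multiple local clients. It keeps data at the local clients and learns a centralized model by sharing the locally learned model parameters. No local data needs to be shared, so privacy can be well protected. Nevertheless, because the model rather than the raw data is shared, the system can be exposed to model poisoning attacks launched by malicious clients. Furthermore, identifying malicious clients is challenging because no local client data is available on the server. In addition, the uploaded model can still be used to estimate a client's local data, leading to privacy disclosure. In this work, we first propose a model-update-based federated averaging algorithm to defend against Byzantine attacks such as additive noise attacks and sign-flipping attacks. An individual client model initialization method is then presented to provide further privacy protection by hiding the individual local machine learning models. Combining the two schemes effectively enhances both privacy and security. The proposed schemes are shown experimentally to converge under non-IID data distributions when there are no attacks. Under Byzantine attacks, the proposed schemes perform much better than the classical model-based FedAvg algorithm.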
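The difference between sharing models and sharing model updates can be sketched as follows. The abstract does not spell out the aggregation rule or the Byzantine defense, so this is plain update-based federated averaging with no defense; the function name and the sample-count weighting are assumptions.

```python
def aggregate_updates(global_model, client_updates, client_sizes):
    """Apply the sample-weighted mean of client updates to the global model.

    Each update is (local_model - global_model), so clients share deltas
    rather than full models. No Byzantine defense is included here.
    """
    total = float(sum(client_sizes))
    new_model = list(global_model)
    for update, size in zip(client_updates, client_sizes):
        for i, delta in enumerate(update):
            new_model[i] += (size / total) * delta
    return new_model
```

For example, with a zero global model, updates `[1.0, 2.0]` and `[3.0, 4.0]`, and client sizes 1 and 3, the second client contributes three quarters of the step.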
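Despite recent success, deep learning-based methods for predicting 3D garment deformation under body motion suffer from interpenetration between the garment and the body. To address this problem, we propose a novel collision-handling neural network layer called the Repulsive Force Unit (ReFU). Based on the signed distance function (SDF) of the underlying body and the current garment vertex positions, ReFU predicts per-vertex offsets that push any interpenetrating vertex to a collision-free configuration while preserving fine geometric details. We show that ReFU is differentiable with trainable parameters and can be integrated into different network backbones that predict 3D garment deformation. Our experiments show that, compared with prior methods based on collision loss or post-processing optimization, ReFU significantly reduces the number of collisions between the body and the garment and better preserves geometric details.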
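The geometric idea of pushing penetrating vertices out along the SDF gradient can be sketched with a sphere standing in for the body. This is a hand-written, non-learned simplification: the layer described above predicts its offsets with trainable parameters, whereas the rule below is fixed, and the sphere body and `margin` parameter are assumptions. Vertices are assumed not to coincide with the sphere center.

```python
import math


def sphere_sdf(point, center, radius):
    """Signed distance to a sphere: negative inside, positive outside."""
    return math.dist(point, center) - radius


def sphere_sdf_gradient(point, center):
    """Unit vector pointing away from the sphere center (the SDF gradient)."""
    d = math.dist(point, center)
    return tuple((p - c) / d for p, c in zip(point, center))


def repel(vertices, center, radius, margin=0.0):
    """Push vertices with SDF < margin out along the SDF gradient.

    Vertices already outside are left untouched. This fixed rule is a
    hand-written simplification; it has no trainable parameters.
    """
    result = []
    for v in vertices:
        depth = margin - sphere_sdf(v, center, radius)
        if depth > 0.0:
            g = sphere_sdf_gradient(v, center)
            v = tuple(p + depth * gi for p, gi in zip(v, g))
        result.append(v)
    return result
```

Moving each vertex by exactly its penetration depth places it on the surface (or at the margin), which is the smallest displacement that resolves the collision for this shape.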
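Automatic smoky vehicle detection in videos offers environmental protection agencies an alternative to traditional, expensive remote sensing with ultraviolet light devices. However, it is challenging to distinguish vehicle smoke from the shadows and wet regions produced by rear vehicles or cluttered roads, and limited annotated data makes the problem worse. In this paper, we first introduce a real-world, large-scale smoky vehicle dataset with 75,000 annotated smoky vehicle images, facilitating the effective training of advanced deep learning models. To enable fair algorithm comparison, we also build a smoky vehicle video dataset comprising 163 long videos with segment-level annotations. Moreover, we present a new coarse-to-fine smoky vehicle detection (CoDeS) framework for efficient smoky vehicle detection. CoDeS first uses a lightweight YOLO detector for fast smoke detection with a high recall rate, then applies a smoke-vehicle matching strategy to eliminate non-vehicle smoke, and finally uses an elaborately designed 3D model to further refine the results in the spatio-temporal space. Extensive experiments on four metrics show that our framework is superior to hand-crafted feature-based methods and recent advanced methods. The code and dataset will be released at https://github.com/pengxj/smokyvehicle.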
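The smoke-vehicle matching stage can be pictured with a small sketch. The abstract does not say how matching is done, so the overlap criterion below (keep a smoke box only if some vehicle box reaches a minimum IoU with it) and the `min_iou` value are assumptions for illustration. Boxes are `(x1, y1, x2, y2)` tuples.

```python
def box_iou(a, b):
    """Intersection over union of two (x1, y1, x2, y2) boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def match_smoke_to_vehicles(smoke_boxes, vehicle_boxes, min_iou=0.05):
    """Keep each smoke box whose best-overlapping vehicle reaches min_iou."""
    matches = []
    for smoke in smoke_boxes:
        best = max(vehicle_boxes, key=lambda v: box_iou(smoke, v), default=None)
        if best is not None and box_iou(smoke, best) >= min_iou:
            matches.append((smoke, best))
    return matches
```

Smoke detections with no nearby vehicle, such as a chimney plume at the edge of the frame, are dropped before any later refinement stage runs.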